Abstract: Data mining is the process of discovering 
patterns in large data sets. The actual data mining 
task is the automatic or semi-automatic analysis of 
large quantities of data to extract previously unknown, 
interesting patterns such as groups of data records 
(cluster analysis), unusual records (anomaly detection) 
and dependencies (association rule mining). This usually 
involves database techniques such as spatial indexes. 
These patterns can then be seen as a kind of summary 
of the input data, and may be used in further analysis 
or, for example, in machine learning and predictive 
analytics. As the internet has become involved in all 
areas of human activity, there are increasing concerns 
that data mining may pose a threat to privacy and 
security, which makes security one of the major issues 
to monitor. In this paper we present recent research 
on data mining and its security, and we prepare a 
survey report on data mining for crime detection. 

I. INTRODUCTION 


Data mining is the process of discovering new 
patterns in large data sets, using methods at the 
intersection of artificial intelligence, machine 
learning, statistics and database systems. It is the 
process of analyzing data from different perspectives 
and summarizing it into useful information: for 
example, predicting the success of a marketing 
campaign, looking for patterns in financial 
transactions to discover illegal activities, or 
analyzing genome sequences.[1] 


For mining decisions, data can be grouped according 
to the following categories: 

• Data classes: Stored data is used to locate data in 
predetermined groups. 
• Data clusters: Data items are grouped according to 
logical relationships or consumer preferences. 
• Data associations: Data can be mined to identify 
associations. 
• Sequential patterns: Data is mined to anticipate 
behavior patterns and trends. 
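
To make the "data associations" category above more concrete, the following is a minimal sketch of the two measures commonly used to rank association rules, support and confidence. The function names and the basket data are hypothetical and chosen only for illustration; they do not come from any system discussed in this paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) over the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Hypothetical market-basket data, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))
print(confidence(baskets, {"bread"}, {"milk"}))
```

A rule such as "bread implies milk" is usually kept only if both values exceed thresholds chosen by the analyst.
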
II. POSSIBLE THREATS TO SECURITY 


A. Predict information about classified work from 
correlation with unclassified work: 


Classification is a data mining technique that 
predicts group membership for data instances based 
on their feature values. Predictive analysis can be 
applied to predict future patterns from a record of 
the past, and that record can be analyzed more 
effectively once the data has been classified. 
Unclassified work may involve duplicate and 
redundant data, which is difficult to manage.[2] 
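
As an illustration of predicting group membership from feature values, the sketch below uses nearest-neighbour voting. This is only one simple classifier, swapped in here because the text does not name a specific algorithm; the feature layout (amount, hour of day), the labels, and the records are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Predict a label for query by majority vote of its k nearest neighbours.

    training is a list of (feature_tuple, label) pairs.
    """
    by_distance = sorted(training, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled records: (amount, hour of day) -> label.
records = [
    ((10.0, 9.0), "normal"),
    ((12.0, 10.0), "normal"),
    ((11.0, 14.0), "normal"),
    ((950.0, 3.0), "suspicious"),
    ((990.0, 2.0), "suspicious"),
    ((970.0, 4.0), "suspicious"),
]

print(knn_predict(records, (980.0, 3.0)))
print(knn_predict(records, (11.0, 11.0)))
```

In practice the features would be scaled first, since an unscaled amount dominates the distance.
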


A correlation is an index of the strength of the 
relationship between two variables. 
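
One common index of this kind is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it directly from its definition; the text does not say which correlation measure is intended, so the choice of Pearson is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only: a perfectly linear increasing relationship.
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))
```
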


B. Detect “hidden” information based on 
“conspicuous” lack of information: 


Data mining techniques are used to detect hidden 
information in large databases. Query generators 
and data interpretation components combine with 
discovery-driven systems to reveal hidden data. 
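
One way to read a "conspicuous lack of information" is as a gap in an otherwise regular record. The following is a minimal sketch under that assumption: it lists the periods in which a source that normally publishes has published nothing. The source names, periods, and the function `conspicuous_gaps` are hypothetical.

```python
def conspicuous_gaps(expected_periods, observed):
    """Map each source to the expected periods in which it published nothing.

    observed maps a source name to the set of periods in which it published.
    Sources with no gaps are left out of the result.
    """
    gaps = {}
    for source, periods in observed.items():
        missing = [p for p in expected_periods if p not in periods]
        if missing:
            gaps[source] = missing
    return gaps


# Hypothetical publication record, invented for illustration.
months = ["2011-01", "2011-02", "2011-03", "2011-04"]
reports = {
    "lab_a": {"2011-01", "2011-02", "2011-03", "2011-04"},
    "lab_b": {"2011-01", "2011-04"},
}

print(conspicuous_gaps(months, reports))
```

A gap by itself proves nothing; it only marks where an analyst might look further.
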

C. Mining “Open Source” data to determine 
predictive events: 


Predictive analysis is a way to use data to predict 
future patterns. It is an area of statistical analysis that 
deals with extracting information from data and using 
it to predict future trends and behavior patterns. The 
core of predictive analytics relies on capturing 
relationships between explanatory variables and the 
predicted variables from past occurrences, and 
exploiting those relationships to predict future 
outcomes. 
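
The idea of capturing a relationship between an explanatory variable and a predicted variable can be sketched with an ordinary least-squares line. This is a minimal example with a single explanatory variable, assuming a linear relationship; the data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the explanatory variable must vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new value of x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical past occurrences: x is the explanatory variable, y the outcome.
model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(predict(model, 5))
```
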
D. Intrusion Detection 


An intrusion can be defined as "any set of actions 
that attempt to compromise the integrity, 
confidentiality or availability of a resource". Intrusion 
prevention techniques, such as user authentication 
(e.g., using passwords or biometrics), avoiding 
programming errors, and information protection (e.g., 
encryption), have been used to protect computer 
systems as a first line of defense.[5] An intrusion 
detection system produces reports, whereas an 
intrusion prevention system is placed in-line and is 
able to actively prevent or block intrusions that are 
detected. Intrusion detection systems identify 
malicious activity, log information about that 
activity, and report it.[2] 
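
The identify, log, and report cycle described above can be sketched with a very simple threshold rule on failed logins. This is an illustrative assumption, not how any particular intrusion detection system works; the event format, the threshold, and the addresses are hypothetical.

```python
from collections import Counter


def detect_intrusions(events, max_failures=3):
    """Return a sorted list of sources whose failed logins exceed max_failures.

    events is an iterable of (source, outcome) pairs, where outcome is
    either "ok" or "fail".
    """
    failures = Counter(source for source, outcome in events if outcome == "fail")
    return sorted(src for src, count in failures.items() if count > max_failures)


# Hypothetical login events, invented for illustration.
log = [("10.0.0.5", "fail")] * 5 + [("10.0.0.7", "fail"), ("10.0.0.7", "ok")]

for source in detect_intrusions(log):
    print("report: repeated failed logins from", source)
```

A detection system would only report such a source, while a prevention system placed in-line could also block it.
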


III. TO IMPROVE SECURITY 


• For privacy concerns, only authorized users should 
have access to privacy-sensitive information such as 
credit card transaction records, health care records, 
biological traits, criminal investigations and ethnicity. 
Various techniques that enhance data mining security 
have therefore been developed to help protect data. 
Databases can employ a multilevel security model to 
classify and restrict data according to various security 
levels, with users permitted access only to their 
authorized levels.[2] 


• For security concerns, data mining can be used for 
crime detection and prevention through various 
programs. The TIA (Terrorism Information Awareness) 
program focused on three specific areas of research: 
language translation; data search with pattern 
recognition and privacy protection; and advanced 
collaborative and decision support tools.[9] In 
CAPPS-II (Computer-Assisted Passenger Prescreening 
System), when a person books a plane ticket, certain 
identifying information is collected by the airline: full 
name, address, etc. This information is checked 
against some data store (e.g., a TSA No-Fly list, 
the FBI Ten Most Wanted Fugitives list, etc.) and used 
to assign a terrorism "risk score" to that person. High 
risk scores require the airline to subject the person to 
extended baggage and/or personal screening, and to 
contact law enforcement if necessary. MATRIX 
(Multistate Anti-Terrorism Information Exchange) 
leverages advanced computer management 
capabilities to more quickly access, share and analyze 
public records to help law enforcement generate 
leads, expedite investigations, and possibly prevent 
terrorist attacks.[3] 
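
The multilevel security model mentioned in the first point above can be sketched as a comparison between a user's clearance and a record's classification. The level names, the records, and the rule that a user may read anything at or below their own level are assumptions made for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical security levels, ordered from least to most restricted."""
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2


def can_read(user_level, record_level):
    """A user may read a record only at or below their own level."""
    return user_level >= record_level


def visible_records(user_level, records):
    """Return the names of the records the user is permitted to read."""
    return [name for name, level in records if can_read(user_level, level)]


# Hypothetical records, each tagged with a security level.
records = [
    ("press release", Level.UNCLASSIFIED),
    ("health record", Level.CONFIDENTIAL),
    ("case file", Level.SECRET),
]

print(visible_records(Level.CONFIDENTIAL, records))
```
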


IV. CONCLUSION 


Data mining involves data analysis tools that 
discover previously unknown, valid patterns and 
relationships in large data sets. In the TIA (Terrorism 
Information Awareness) program, a data mining 
application is designed to identify potential terrorist 
suspects in a large pool of individuals using a 
statistical approach, in which each individual is tested 
against a predesigned model that includes information 
about known terrorists. However, while this may 
reaffirm a particular profile, it does not necessarily 
mean that the application will identify an individual 
whose behavior deviates significantly from the 
original model; conversely, an individual may be 
considered a suspect merely because some of their 
information matches the original model. 


V. FUTURE WORK 

We are in the initial stage of our research, and much 
remains to be done, including the following task: 

In the TIA program, person identification must not be 
based on a statistical approach, i.e. comparison with a 
standard model and known behavioral patterns. We are 
trying to design a technology-based analysis tool for 
the Terrorism Information Awareness program. 